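This paper simulates low-resource settings across 17 languages to evaluate similarity, stability, and reliability under different conditions. The goal is to use corpus similarity measures before training to predict the properties of embeddings after training. The main contribution is to show that upstream corpus similarity measures can be used to predict downstream embedding similarity. This finding is then applied to low-resource settings by modeling the reliability of embeddings created from very limited training data. The results show that the reliability of low-resource embeddings can be estimated using corpus similarity measures that remain robust on small amounts of data. These findings have significant implications for the evaluation of truly low-resource languages, for which such systematic downstream validation methods are not possible because of data limitations.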
translated by 谷歌翻译
In this paper we discuss the theory used in the design of an open source lightmorphic signatures analysis toolkit (LSAT). In addition to providing a core functionality, the software package enables specific optimizations with its modular and customizable design. To promote its usage and inspire future contributions, LSAT is publicly available. By using a self-supervised neural network and augmented machine learning algorithms, LSAT provides an easy-to-use interface with ample documentation. The experiments demonstrate that LSAT improves the otherwise tedious and error-prone tasks of translating lightmorphic associated data into usable spectrograms, enhanced with parameter tuning and performance analysis. With the provided mathematical functions, LSAT validates the nonlinearity encountered in the data conversion process while ensuring suitability of the forecasting algorithms.
Estimating the 6D pose of objects is one of the major fields in 3D computer vision. Following promising results in instance-level pose estimation, research is moving toward category-level pose estimation for more practical application scenarios. However, unlike well-established instance-level pose datasets, available category-level datasets fall short in annotation quality and in the number of poses provided. We propose HouseCat6D, a new category-level 6D pose dataset featuring 1) multi-modality of polarimetric RGB+P and depth, 2) 194 highly diverse objects from 10 household object categories, including 2 photometrically challenging categories, 3) high-quality pose annotation with an error range of only 1.35 mm to 1.74 mm, 4) 41 large-scale scenes with extensive viewpoint coverage, and 5) a checkerboard-free environment throughout the entire scene. We also provide benchmark results of state-of-the-art category-level pose estimation networks.
In the contemporary media landscape, with the vast and diverse supply of news, it is increasingly challenging to study such an enormous amount of items without a standardized framework. Although attempts have been made to organize and compare news items on the basis of news values, news genres receive little attention, especially the genres in a news consumer's perception. Yet, perceived news genres serve as an essential component in exploring how news has developed, as well as a precondition for understanding media effects. We approach this concept by conceptualizing and operationalizing a non-discrete framework for mapping news items in terms of genre cues. As a starting point, we propose a preliminary set of dimensions consisting of "factuality" and "formality". To automatically analyze a large amount of news items, we deliver two computational models for predicting news sentences in terms of the said two dimensions. Such predictions could then be used for locating news items within our framework. This proposed approach that positions news items upon a multidimensional grid helps in deepening our insight into the evolving nature of news genres.
Previous attempts to predict stock price from limit order book (LOB) data are mostly based on deep convolutional neural networks. Although convolutions offer efficiency by restricting their operations to local interactions, it is at the cost of potentially missing out on the detection of long-range dependencies. Recent studies address this problem by employing additional recurrent or attention layers that increase computational complexity. In this work, we propose Axial-LOB, a novel fully-attentional deep learning architecture for predicting price movements of stocks from LOB data. By utilizing gated position-sensitive axial attention layers our architecture is able to construct feature maps that incorporate global interactions, while significantly reducing the size of the parameter space. Unlike previous works, Axial-LOB does not rely on hand-crafted convolutional kernels and hence has stable performance under input permutations and the capacity to incorporate additional LOB features. The effectiveness of Axial-LOB is demonstrated on a large benchmark dataset, containing time series representations of millions of high-frequency trading events, where our model establishes a new state of the art, achieving an excellent directional classification performance at all tested prediction horizons.
Tasks critical to enterprise profitability, such as customer churn prediction, fraudulent account detection or customer lifetime value estimation, are often tackled by models trained on features engineered from customer data in tabular format. Application-specific feature engineering adds development, operationalization and maintenance costs over time. Recent advances in representation learning present an opportunity to simplify and generalize feature engineering across applications. When applying these advancements to tabular data researchers deal with data heterogeneity, variations in customer engagement history or the sheer volume of enterprise datasets. In this paper, we propose a novel approach to encode tabular data containing customer transactions, purchase history and other interactions into a generic representation of a customer's association with the business. We then evaluate these embeddings as features to train multiple models spanning a variety of applications. CASPR, Customer Activity Sequence-based Prediction and Representation, applies Transformer architecture to encode activity sequences to improve model performance and avoid bespoke feature engineering across applications. Our experiments at scale validate CASPR for both small and large enterprise applications.
Commonly used AI networks are very self-confident in their predictions, even when the evidence for a certain decision is dubious. The investigation of a deep learning model output is pivotal for understanding its decision processes and assessing its capabilities and limitations. By analyzing the distributions of raw network output vectors, it can be observed that each class has its own decision boundary and, thus, the same raw output value has different support for different classes. Inspired by this fact, we have developed a new method for out-of-distribution detection. The method offers an explanatory step beyond simple thresholding of the softmax output towards understanding and interpretation of the model learning process and its output. Instead of assigning the class label of the highest logit to each new sample presented to the network, it takes the distributions over all classes into consideration. A probability score interpreter (PSI) is created based on the joint logit values in relation to their respective correct vs wrong class distributions. The PSI suggests whether the sample is likely to belong to a specific class, whether the network is unsure, or whether the sample is likely an outlier or unknown type for the network. The simple PSI has the benefit of being applicable on already trained networks. The distributions for correct vs wrong class for each output node are established by simply running the training examples through the trained network. We demonstrate our OOD detection method on a challenging transmission electron microscopy virus image dataset. We simulate a real-world application in which images of virus types unknown to a trained virus classifier, yet acquired with the same procedures and instruments, constitute the OOD samples.
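The idea of comparing each logit against per-class "correct" and "wrong" distributions can be sketched as follows. This is only an illustrative sketch, not the authors' PSI: the Gaussian fit, the likelihood-ratio score, the thresholds `accept` and `reject`, and all function names are assumptions introduced here, since the abstract does not give the exact formula.

```python
import math
from statistics import mean, pstdev


def fit_class_stats(logits, labels, num_classes):
    """Per output node, fit (mean, std) of its logit when that node is the
    true class ("correct") and when it is not ("wrong").
    Assumes every class has at least one training sample in each group."""
    stats = []
    for c in range(num_classes):
        correct = [row[c] for row, y in zip(logits, labels) if y == c]
        wrong = [row[c] for row, y in zip(logits, labels) if y != c]
        stats.append(((mean(correct), pstdev(correct) or 1e-6),
                      (mean(wrong), pstdev(wrong) or 1e-6)))
    return stats


def gaussian_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def class_support(logit_row, stats):
    """Score in [0, 1] per class: how much more the logit resembles the
    "correct" distribution than the "wrong" one (assumed Gaussian)."""
    scores = []
    for x, ((mu_c, sd_c), (mu_w, sd_w)) in zip(logit_row, stats):
        p_c = gaussian_pdf(x, mu_c, sd_c)
        p_w = gaussian_pdf(x, mu_w, sd_w)
        scores.append(p_c / (p_c + p_w) if p_c + p_w > 0 else 0.0)
    return scores


def interpret(logit_row, stats, accept=0.9, reject=0.1):
    """Return ("class", c), ("unsure", None) or ("outlier", None).
    The two thresholds are hypothetical and would need tuning."""
    scores = class_support(logit_row, stats)
    supported = [c for c, s in enumerate(scores) if s >= accept]
    if len(supported) == 1:
        return ("class", supported[0])
    if all(s <= reject for s in scores):
        return ("outlier", None)
    return ("unsure", None)
```

Because the statistics come from running training examples through an already trained network, a sketch like this needs no retraining; only the stored logits and labels are required.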
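In the past few years, neural networks (NNs) have evolved from laboratory settings to the state of the art for many real-world problems. It has been shown that NN models (i.e., their weights and biases) evolve along unique trajectories in weight space during training. Consequently, a population of such neural network models (referred to as a model zoo) forms structures in weight space. We believe that the geometry, curvature, and smoothness of these structures contain information about the state of training and can reveal latent properties of individual models. With such model zoos, one could investigate (i) novel approaches to model analysis, (ii) the discovery of unknown learning dynamics, (iii) learning rich representations of such populations, or (iv) exploiting the model zoos for generative modeling of NN weights and biases. Unfortunately, the lack of standardized model zoos and available benchmarks significantly increases the friction for further research on populations of NNs. With this work, we publish a novel dataset of model zoos containing systematically generated and diverse populations of NN models for further research. In total, the proposed model zoo dataset is based on eight image datasets and consists of 27 model zoos trained with varying hyperparameter combinations. It includes 50,360 unique NN models as well as their sparsified twins, resulting in over 3,844,360 collected models. In addition to the model zoo data, we provide an in-depth analysis of the zoos and benchmarks for multiple downstream tasks. The dataset is available at www.modelzoos.cc.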
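Learning representations of neural network weights given a model zoo is an emerging and challenging area with many potential applications, from model inspection to neural architecture search or knowledge distillation. Recently, an autoencoder trained on a model zoo was able to learn a hyper-representation that captures intrinsic and extrinsic properties of the models in the zoo. In this work, we extend hyper-representations for generative use, to sample new model weights. We propose layer-wise loss normalization, which we demonstrate is key to generating high-performing models, as well as several sampling methods based on the topology of hyper-representations. The models generated using our methods are diverse, performant, and able to outperform strong baselines, as evaluated on several downstream tasks: initialization, ensemble sampling, and transfer learning. Our results indicate the potential of aggregating knowledge from model zoos into new models via hyper-representations, thereby paving the way for novel research directions.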
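One way to read layer-wise loss normalization is that the reconstruction error of each layer's weights is rescaled by that layer's spread, so layers with large-magnitude weights do not dominate the loss. The sketch below is written under that assumption only; the abstract gives no formula, and the function name, the flat weight-vector layout, and the use of a per-layer standard deviation are all hypothetical.

```python
def layerwise_normalized_mse(target, recon, layer_slices, layer_std):
    """Mean squared reconstruction error over a flattened weight vector,
    with each layer's error divided by that layer's standard deviation.

    target, recon : flat lists of weights (original and reconstructed)
    layer_slices  : list of (start, stop) index pairs, one per layer
    layer_std     : assumed per-layer standard deviation, e.g. estimated
                    over all models in the zoo (hypothetical choice)
    """
    total = 0.0
    count = 0
    for (start, stop), std in zip(layer_slices, layer_std):
        for t, r in zip(target[start:stop], recon[start:stop]):
            total += ((t - r) / std) ** 2
            count += 1
    return total / count
```

In this toy setting, two layers whose errors are equally large relative to their own scale contribute equally, whereas a plain mean squared error would be dominated by the layer with the larger weights.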
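Detecting accounting anomalies is a recurring challenge in financial statement audits. Recently, novel methods derived from deep learning (DL) have been proposed to audit the large volumes of accounting records underlying a statement. However, because of their vast number of parameters, such models have the drawback of being inherently opaque. At the same time, concealing a model's inner workings often hinders its real-world application. This observation holds particularly true in financial audits, since auditors must reasonably explain and justify their audit decisions. Various explainable AI (XAI) techniques have been proposed to address this challenge, e.g., SHapley Additive exPlanations (SHAP). However, in unsupervised DL, as often applied in financial audits, these methods explain the model output at the level of encoded variables. As a result, explanations of autoencoder neural networks (AENNs) are often hard for human auditors to comprehend. To mitigate this drawback, we propose RESHAPE, which explains the model output at an aggregated attribute level. In addition, we introduce an evaluation framework to compare the versatility of XAI methods in auditing. Our experimental results provide empirical evidence that RESHAPE yields versatile explanations compared with state-of-the-art baselines. We envision such attribute-level explanations as a necessary next step in the adoption of unsupervised DL techniques in financial auditing.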
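The difference between encoded-variable-level and attribute-level explanations can be illustrated with a small sketch. It assumes that categorical attributes were one-hot encoded into columns named `attribute=value` and that per-column attribution scores are already available; summing the scores per attribute is an assumption made here for illustration and is not claimed to be the aggregation the authors use. The function name and the column names are hypothetical.

```python
from collections import defaultdict


def aggregate_attributions(encoded_names, attributions, sep="="):
    """Collapse attribution scores of one-hot encoded columns into one
    score per original attribute by summing (assumed aggregation rule).

    encoded_names : column names such as "account=1000" (hypothetical)
    attributions  : one attribution score per encoded column
    """
    totals = defaultdict(float)
    for name, value in zip(encoded_names, attributions):
        attribute = name.split(sep, 1)[0]  # text before the separator
        totals[attribute] += value
    return dict(totals)
```

An auditor would then see one score per attribute of the journal entry, for example one for the account and one for the currency, rather than one score per encoded column.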